ESRD03 Database and the Labeling for Environmental Sound Recognition
Abstract
ESRD03 is a database for environmental sound recognition, collected from 21 sound effect CDs and the RWCP (Sound Scene Database in Real Acoustic Environment) database. It contains more than 16000 sound tracks, most of which occur in home environments. This paper describes how the data were segmented and labeled manually. First, the tracks are categorized according to differences in the object, material and motion of their sources. The tracks are then segmented and labeled within each category. In addition to high-level labeling based on content description, which gives a natural description of the sound and is close to human perception, the properties of an event are also segmented and labeled where they can be perceived. Several labeling schemes are designed in each category to capture different characteristics of the sound, and a sound can be labeled with several schemes simultaneously. The schemes are intended as a reference for automatic property extraction and evaluation and for identification studies. They are based on preliminary research on environmental sound recognition, and the results indicate that the scheme is sensible.
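As an illustration only, the labeling structure described in the abstract (tracks categorized by object, material and motion, then segmented, with each segment carrying labels from several schemes at once) could be represented roughly as follows. This is a minimal sketch, not the format used by ESRD03: the class names, field names, scheme names ("content", "event") and example values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Segment:
    """A time span within a track (hypothetical structure)."""
    start_s: float
    end_s: float
    # Maps a labeling-scheme name to the label given under that scheme,
    # so one segment can be labeled under several schemes simultaneously.
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class Track:
    """One sound track, categorized by the source's object, material and motion."""
    track_id: str
    obj: str
    material: str
    motion: str
    segments: List[Segment] = field(default_factory=list)

    def category(self) -> Tuple[str, str, str]:
        """Return the (object, material, motion) category of the track."""
        return (self.obj, self.material, self.motion)

    def segments_with(self, scheme: str) -> List[Segment]:
        """Return the segments that carry a label under the given scheme."""
        return [s for s in self.segments if scheme in s.labels]


# Hypothetical example values, not taken from the database.
track = Track("demo-0001", obj="door", material="wood", motion="knock")
track.segments.append(
    Segment(0.0, 0.4, {"content": "knock on a wooden door", "event": "impact"})
)
track.segments.append(Segment(0.4, 1.0, {"content": "silence"}))
```

Keeping labels in a per-segment mapping keyed by scheme name is one simple way to let a content description and an event-property label coexist on the same span.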
Similar resources
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Abstract Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Effect of sound classification by neural networks in the recognition of human hearing
In this paper, we focus on two basic issues: (a) the classification of sound by neural networks based on frequency and sound intensity parameters, and (b) evaluating the health of different human ears as compared to those of a healthy person. Sound classification by a specific feed forward neural network with two inputs, frequency and sound intensity, and two hidden layers is proposed. This process...
Developing a Standardized Medical Speech Recognition Database for Reconstructive Hand Surgery
Fast and holistic access to the patients’ clinical record is a major requirement of modern medical decision support systems (DSS). While electronic health records (EHRs) have replaced the traditional paper-based records in most healthcare organizations, data entry into these systems remains largely manual. Speech recognition technology promises substitution of the more convenient speech-base...
Different Profiles of Verbal and Nonverbal Auditory Impairment in Cortical and Subcortical Lesions
ABSTRACT Introduction: We investigated the differential role of cortical and subcortical regions in verbal and non-verbal sound processing in ten patients who were native speakers of Persian with unilateral cortical and/or unilateral and bilateral subcortical lesions and 40 normal speakers as control subjects. Methods: The verbal tasks included monosyllabic, disyllabic dichotic and diotic tas...
Publication date: 2004